AITopics | nonsupport vector

Recent work shows that gradient descent on linearly separable data is implicitly biased towards the maximum margin solution. However, no convergence rate which is tight in both n (the dataset size) and t (the training time) is given. This work proves that the normalized gradient descent iterates converge to the maximum margin solution at a rate of O(ln(n)/ ln(t)), which is tight in both n and t. The proof is via a dual convergence result: gradient descent induces a multiplicative weights update on the (normalized) SVM dual objective, whose convergence rate leads to the tight implicit bias rate.

artificial intelligence, lemma 2, machine learning, (15 more...)

arXiv.org Machine Learning

1906.0454

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Scaling SVM and Least Absolute Deviations via Exact Data Reduction

Wang, Jie, Wonka, Peter, Ye, Jieping

arXiv.org Machine LearningOct-25-2013

The support vector machine (SVM) is a widely used method for classification. Although many efforts have been devoted to develop efficient solvers, it remains challenging to apply SVM to large-scale problems. A nice property of SVM is that the non-support vectors have no effect on the resulting classifier. Motivated by this observation, we present fast and efficient screening rules to discard non-support vectors by analyzing the dual problem of SVM via variational inequalities (DVI). As a result, the number of data instances to be entered into the optimization can be substantially reduced. Some appealing features of our screening method are: (1) DVI is safe in the sense that the vectors discarded by DVI are guaranteed to be non-support vectors; (2) the data set needs to be scanned only once to run the screening, whose computational cost is negligible compared to that of solving the SVM problem; (3) DVI is independent of the solvers and can be integrated with any existing efficient solvers. We also show that the DVI technique can be extended to detect non-support vectors in the least absolute deviations regression (LAD). To the best of our knowledge, there are currently no screening methods for LAD. We have evaluated DVI on both synthetic and real data sets. Experiments indicate that DVI significantly outperforms the existing state-of-the-art screening rules for SVM, and is very effective in discarding non-support vectors for LAD. The speedup gained by DVI rules can be up to two orders of magnitude.

artificial intelligence, machine learning, nonsupport vector, (16 more...)

arXiv.org Machine Learning

1310.7048

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Parallel Support Vector Machines: The Cascade SVM

Graf, Hans P., Cosatto, Eric, Bottou, Léon, Dourdanovic, Igor, Vapnik, Vladimir

Neural Information Processing SystemsDec-31-2005

We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of analyzing the whole training set in one optimization step, the data are split into subsets and optimized separately with multiple SVMs. The partial results are combined and filtered again in a'Cascade' of SVMs, until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and requires far less memory, since the kernel matrices are much smaller than for a regular SVM. Convergence to the global optimum is guaranteed with multiple passes th rough the Cascade, but already a single pass provides good generalization. A single pass is 5x - 10x faster than a regular SVM for problems of 100,000 vectors when implemented on a single processor. Parallel implementations on a cluster of 16 processors were tested with over 1 million vectors (2-class problems), converging in a day or two, while a regular SVM never converged in over a week.

cascade, support vector, vector, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Parallel Support Vector Machines: The Cascade SVM

Graf, Hans P., Cosatto, Eric, Bottou, Léon, Dourdanovic, Igor, Vapnik, Vladimir

Neural Information Processing SystemsDec-31-2005

We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of analyzing the whole training set in one optimization step, the data are split into subsets and optimized separately with multiple SVMs. The partial results are combined and filtered again in a'Cascade' of SVMs, until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and requires far less memory, since the kernel matrices are much smaller than for a regular SVM. Convergence to the global optimum is guaranteed with multiple passes th rough the Cascade, but already a single pass provides good generalization. A single pass is 5x - 10x faster than a regular SVM for problems of 100,000 vectors when implemented on a single processor. Parallel implementations on a cluster of 16 processors were tested with over 1 million vectors (2-class problems), converging in a day or two, while a regular SVM never converged in over a week.

cascade, support vector, vector, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Europe > Germany > Saxony > Leipzig (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback

Parallel Support Vector Machines: The Cascade SVM

Graf, Hans P., Cosatto, Eric, Bottou, Léon, Dourdanovic, Igor, Vapnik, Vladimir

Neural Information Processing SystemsDec-31-2005

We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. Instead of analyzing the whole training set in one optimization step, the data are split into subsets and optimized separately with multiple SVMs. The partial results are combined and filtered again in a'Cascade' of SVMs, until the global optimum is reached. The Cascade SVM can be spread over multiple processors with minimal communication overhead and requires far less memory, since the kernel matrices are much smaller than for a regular SVM. Convergence to the global optimum is guaranteed with multiple passes through the Cascade, but already a single pass provides good generalization. A single pass is 5x - 10x faster than a regular SVM for problems of 100,000 vectors when implemented on a single processor. Parallel implementations on a cluster of 16 processors were tested with over 1 million vectors (2-class problems), converging in a day or two, while a regular SVM never converged in over a week.

artificial intelligence, machine learning, vector, (18 more...)

Neural Information Processing Systems

Country: Europe (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Add feedback